Exercise: +-1 bug and center of an array problem
Abstract
A problem that crops up constantly when designing even the simplest algorithm or program is the ±1 bug, or off-by-one bug, which appears when we calculate positions within an array, most noticeably when splitting it in half. This bug often lies behind buffer overflows. While designing one complicated algorithm, we needed various ways of splitting an array and found no general guidance for this apparently minor problem. We present an exercise that tracks down the cause of the problem and leads to a solution. The problem looks trivial because it seems obvious or insignificant; however, treating it without the utmost precision can lead to subtle bugs, unbalanced solutions, and expressions that do not carry over transparently between languages. Basically, the exercise is about dealing with ≤ and < as well as n/2, n/2-1, (n+1)/2, n-1 and similar expressions when they are rounded down to the nearest integer and used to define a range. Mathematics never crashes; programs do.

Wobbling center of array and other stories

An array is a contiguous buffer in memory. It starts at position b and ends at position e. All positions between b and e (including b and e) belong to the array. The size or length of the array is n = e-b+1. The problem is apparently very simple: where is the center of the array, and how does it relate to the size, the start position and the end position? No more than that.

There are several causes of the problem:

1. There is confusion across languages about the initial index of an array. For example, by default it is 0 in PHP, C and C++, and 1 in COBOL, Fortran and Smalltalk; other languages, such as Ada and Pascal, use the lowest value of the index type. Quick rules learned for one language do not transfer transparently to another.
2. Expressions like n/2 and similar ones with integer division are rounded down to the nearest integer, which is not obvious from the syntax.
3. The definition of center is vague. An array of even size does not have a central position.
4. A wrong assumption that the center of an array is obviously n/2, which then needs adjusting because it does not work all the time.
5. The existing code in the literature or elsewhere does not follow any particular guidance. Reading such code is a source of confusion.
6. The index used in an algorithm or program can have several hidden meanings throughout the code.
7. A foggy understanding of expressions like (n-1)/2+1, (n+1)/2 and n/2-1 when they are integers rounded down to the nearest integer, together with constant attempts to reason modulo 2.
8. Because the problem looks trivial, it is not given sufficient attention.
9. The treatment of indices looks somewhat magical even in very standard and praised programming literature.
10. Special cases like n=0 and n=1 are merged into the general solution.
11. There is additional confusion about whether to use < or ≤.
12. A formula from mathematics is applied directly, rather than carefully reconstructed.
13. There is confusion between a value used as a position and a value used as a length. For example, 0 can be the first position in an array, but as a length it represents an empty array.
14. A headache about counting the number of elements in an array when the boundaries are not clearly specified. How many numbers do we have from 0 to 5? 6, 5, 4?

In the exercise, we will use two simple examples, 01234 and 012345, and two values for finding a center: n/2 and (e-b)/2 = s/2. These two examples are all we need. We will explain what happens when the initial index is not 0. We explain everything using the 0-index case because all other cases can be easily derived from it. All values are rounded down to the nearest integer, i.e. we are dealing with integer division. We use two ways of trying to reach the center: through (n±1)/2 and through n/2±1. We will find in our analysis that the following are the best expressions for various programming languages.
The goal was to find as few items to memorize as possible, and an easy way of extending them to various situations. In the work, we give a very detailed explanation of these expressions and of the potential pitfalls of others. We deliberately dig down to the last detail to show that even the most correct mathematical formula cannot prevent bugs.

Final solution that handles central position

                           left half               right half
Natural division           0 ≤ i < n/2             (n+1)/2 ≤ i < n
Left+ division             0 ≤ i < (n+1)/2         (n+1)/2 ≤ i < n
Right+ division            0 ≤ i < n/2             n/2 ≤ i < n
Cut out center division
  left cut                 0 ≤ i < (n+1)/2-1       (n+1)/2 ≤ i < n
  right cut                0 ≤ i < n/2             n/2+1 ≤ i < n

For this entire table, you need to remember only one pair of ranges, 0 ≤ i < n/2 and (n+1)/2 ≤ i < n, which is obvious if you know the formula n = ⌊n/2⌋ + ⌊(n+1)/2⌋.

List of the common rules

Let us start with a few examples to illustrate the problem. If we have a value r, 0 ≤ r ≤ n, then:

b ≤ i ≤ e       has length n = e-b+1
b < i ≤ e       has length n-1
b ≤ i < e       has length n-1
b < i < e       has length n-2
b ≤ i < b+r     has length r
b ≤ i ≤ b+r     has length r+1, which means that we must have r < n
b+r ≤ i ≤ e     has length n-r
b+r < i ≤ e     has length n-r-1
b ≤ i ≤ e-r     has length n-r
b ≤ i < e-r     has length n-r-1

i ≤ e can be replaced with i < n only if e = n-1, which means that we are using the default of a 0-indexed language or that b has the constant value 0 throughout the program (b = 0).

If we have the size of an array, then in order to cover all possible combinations similar to those above, we will mostly write ranges in this general form:

u ≤ i < w

This expression connects the length n and the position [8] in a simple and obvious way: the number of positions is simply n = w-u. Additionally, it suits many programming languages that require a strict initial bound.
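The division table above can be sketched in Python, where `//` rounds down and `range(start, stop)` already has the form u ≤ i < w. This is only an illustration under those assumptions; the names `split_ranges` and `pick` are hypothetical helpers, not part of the original work.

```python
def split_ranges(n):
    """Half-open (start, stop) pairs, meaning start <= i < stop, for an array 0 <= i < n."""
    lo = n // 2          # n/2 rounded down
    hi = (n + 1) // 2    # (n+1)/2 rounded down, which equals n/2 rounded up
    return {
        "natural":   ((0, lo), (hi, n)),      # a central element belongs to neither half
        "left+":     ((0, hi), (hi, n)),      # a central element goes to the left half
        "right+":    ((0, lo), (lo, n)),      # a central element goes to the right half
        "left cut":  ((0, hi - 1), (hi, n)),  # drops the center, or the element just left of the middle
        "right cut": ((0, lo), (lo + 1, n)),  # drops the center, or the element just right of the middle
    }


def pick(s, bounds):
    """Characters of s at positions start <= i < stop; empty when stop <= start."""
    start, stop = bounds
    return "".join(s[i] for i in range(start, stop))


# The two examples used in the exercise.
for s in ("01234", "012345"):
    for name, (left, right) in split_ranges(len(s)).items():
        print(f"n={len(s)} {name:9} {pick(s, left)!r:6} {pick(s, right)!r}")
```

For 01234 the natural division gives 01 and 34 and leaves the central 2 out, while for 012345 it gives 012 and 345. `pick` iterates over `range` instead of slicing because the left cut for n=0 has stop -1, which a Python slice would interpret as "one before the end" rather than as an empty range.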
If we have any other expression, we compare it with the form u ≤ i < w and:

• switching ≤ to < is a reduction by one, and we have to subtract 1 from the difference
• switching < to ≤ is a promotion by one, and we have to add 1 to the difference

Another advantage is that excluding elements from the beginning or end of an array becomes automatic. To exclude g elements from the beginning, we simply write g ≤ i < n (n is no longer the size of the array; if we need to work with the new size, the range is g ≤ i < g+(n-g), where n-g is the new size). If the initial position is b, the range is b+g ≤ i < b+n (or b+g ≤ i < b+g+(n-g), where n-g is the new size). Excluding h elements from the end gives 0 ≤ i < n-h or, with another value for b, b ≤ i < b+(n-h) (n-h is the new size of the array in this case).

Remember that the left side of u ≤ i < w is inclusive and the right side is exclusive. If we need k elements to the left of the element at p, excluding p, we have p-k ≤ i < p. If we need k elements including p, it becomes p-k+1 ≤ i < p+1, which becomes, after a promotion on the right side and a reduction on the left, p-k < i ≤ p. If we need k elements to the right of p, including p, the expression is p ≤ i < p+k; excluding p it is p+1 ≤ i < p+k+1 or, after a reduction on the left side and a promotion on the right, p < i ≤ p+k.

Both expressions are easy to understand. In p-k < i ≤ p we say including p, which is why we have ≤ p. Equally, in p < i ≤ p+k we say excluding p, which is why we have p <. Either way, they both have k elements. Overall, this means that even the expression u < i ≤ w keeps the rule n = w-u.

We now show briefly how to use the rules for b ≤ i ≤ b+r. Compared with the expected form ... ≤ ... < ..., we have changed one < to ≤, which is a promotion and adds 1 to the result, so the number of positions in this expression is (b+r)-b+1 = r+1. If we have only the left and right bounds, we express the range as b ≤ i ≤ e with n = e-b+1, since we have a promotion on the right side. Overall we have this table:

Range           Number of positions    Used expressions    Adjustment to w-u
u ≤ i ≤ w       n = w-u+1              ≤ ≤                 +1
u ≤ i < w       n = w-u                ≤ <                  0
u < i ≤ w       n = w-u                < ≤                  0
u < i < w       n = w-u-1              < <                 -1

You could use the mnemonic rule that each < has a hidden penalty of -1/2 and each ≤ has a cost of +1/2. If they are used together, the penalties cancel each other; if we use two <, the total cost is -1, while two ≤ need a +1 adjustment.

To complete the summary, we add:

• switching from u ≤ i < w to the form b ≤ i < b+n gives u ≤ i < u+(w-u)
• switching from u ≤ i < w to the form b ≤ i ≤ e gives u ≤ i ≤ w-1

Now, back to the division problem. Let us see what happens if we try to use the expression n/2 directly to split the array in half, assuming b = 0.

Ranges with central position included (number of elements)

#   Range            n=-1      n=0   n=1   n=2   n=3   n=4
1   0 ≤ i < n/2      0         0     0     1     1     2
2   0 ≤ i ≤ n/2      1 or 0    1     1*    2     2*    3
3   n/2 < i < n      0         0     0     0     1     1
4   n/2 ≤ i < n      0         0     1*    1     2*    2

* includes central position

Ranges that exclude some element around center if not the center itself (number of elements)

#   Range            n=-1      n=0   n=1   n=2   n=3   n=4
1   0 ≤ i < n/2      0         0     0     1     1     2
3   n/2 < i < n      0         0     0     0     1     1
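The counts in the n/2 table can be reproduced with a small Python sketch of the adjustment rule: start from w-u+1 for u ≤ i ≤ w and subtract 1 for every strict bound. The helpers `count` and `rows` are hypothetical names introduced only for this illustration. The sketch also assumes that the "1 or 0" entry for n=-1 reflects the two common conventions for dividing a negative integer: rounding down (Python's `//`) and truncating toward zero (as in C, imitated here with `int(n / 2)`).

```python
def count(u, w, left_strict, right_strict):
    """Number of integers i between u and w, where each bound is < if strict, else <=."""
    n = w - u + 1                      # count for u <= i <= w
    n -= 1 if left_strict else 0       # switching <= to < on the left is a reduction
    n -= 1 if right_strict else 0      # switching <= to < on the right is a reduction
    return max(0, n)                   # an inverted range is simply empty


def rows(n, half):
    """Counts for table rows 1..4, where half is the chosen integer value of n/2."""
    return (
        count(0, half, False, True),   # row 1: 0 <= i < n/2
        count(0, half, False, False),  # row 2: 0 <= i <= n/2
        count(half, n, True, True),    # row 3: n/2 < i < n
        count(half, n, False, True),   # row 4: n/2 <= i < n
    )


for n in range(5):
    print(n, rows(n, n // 2))

# n = -1 depends on how the language divides a negative integer.
print(-1, "rounded down:", rows(-1, -1 // 2))
print(-1, "truncated:   ", rows(-1, int(-1 / 2)))
```

Rows 1 and 3 together never cover the position n/2, which is why they reappear in the second table, while row 2 overshoots by one element whenever n is even.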
Journal: CoRR
Volume: abs/1402.4843
Publication date: 2014